An Evaluation of Yelp Dataset
Abstract
Yelp is one of the largest online search and review systems for many kinds of businesses, including restaurants, shopping, home services, and others. Analyzing real-world data from Yelp is valuable for understanding the interests of users, which helps to improve the design of next-generation systems. This paper evaluates the Yelp dataset provided in the Yelp data challenge and reports several interesting results. For instance, reaching anyone in the Yelp social network takes only 4.5 hops on average, which is consistent with the classical six degrees of separation theory; the Elite user mechanism is especially effective in maintaining the health of the whole network; and users who write fewer than 100 business reviews dominate. These insights are expected to help Yelp make intelligent business decisions in the future.
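As an illustration of the 4.5-hop figure, an average hop count of this kind can be computed by running a breadth-first search from every user and averaging the shortest-path lengths over all reachable pairs. The sketch below is not the paper's method: the function name `average_hops` and the toy friendship graph are assumptions made for this example, and the real Yelp graph would need a sampled or parallel variant to be tractable.

```python
from collections import deque


def average_hops(friends):
    """Mean shortest-path length over all reachable ordered user pairs.

    `friends` maps a user id to an iterable of friend ids (hypothetical
    input format; undirected friendships should be listed in both directions).
    """
    total = 0
    pairs = 0
    for source in friends:
        # Breadth-first search gives hop counts from `source` to everyone reachable.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            user = queue.popleft()
            for friend in friends.get(user, ()):
                if friend not in dist:
                    dist[friend] = dist[user] + 1
                    queue.append(friend)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical toy graph: a chain of four users, a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(average_hops(toy))
```

Unreachable pairs are simply left out of the average here; whether that matches the paper's treatment of disconnected users is not stated in the abstract.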
Similar resources
CS224D Project Final Report Summarizing Reviews and Predicting Rating for Yelp Dataset
The report explores the use of Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN) in summarising text reviews and predicting review rating for the Yelp dataset. I use the fact that the reviews are labelled (by rating) to extract important sentences/words, which are then used as the summary for the review. I use an interesting evaluation technique to measure the relevance of...
Reviews and Neighbors Influence on Performance of Business
The task of rating prediction has been gaining popularity, especially after several companies came up with competitions such as the Netflix Challenge and the Yelp dataset challenge[12]. In this paper, we aim to modify and incorporate two methods for rating prediction of businesses: one utilizes the semantics of the review text, while the other uses the influence of the surrounding businesses. ...
CS224d Project Final Report
We develop a Recurrent Neural Network (RNN) Language Model to extract sentences from Yelp Review Data for the purpose of automatic summarization. We compare these extracted sentences against user-generated tips in the Yelp Academic Dataset using ROUGE and BLEU metrics for summarization evaluation. The performance of a uni-directional RNN is compared against word-vector averaging.
Automatic Business Attribute Labeling from Yelp Reviews
In this paper, we present a predictive model capable of assigning attributes to businesses based on their consumer reviews and evaluations. We utilized the Yelp Dataset Challenge [1] data sources containing approximately 140K businesses, 4 million reviews, 1 million tips, and 1 million business-related attribute tags. We transformed the review data into TF-IDF term-vectors as well as Word2Vec r...
Link Prediction in Bipartite Networks - Predicting Yelp Reviews
In this paper, we aim to predict new user reviews on businesses in the Yelp social network. We formulate this as a network link prediction problem by modeling Yelp dataset as a bipartite network between users and businesses. We implement link prediction algorithms with various proximity metrics, thoroughly evaluate the effectiveness of each algorithm and conclude that Delta, AdamicAdar and Comm...
Journal title:
- CoRR
Volume: abs/1512.06915
Publication date: 2015